 llm work


Demystify, Use, Reflect: Preparing students to be informed LLM-users

arXiv.org Artificial Intelligence

We redesigned our post-CS1 course, which introduces various subfields of computer science, to integrate Large Language Models (LLMs) in a structured, critical, and practical manner. The course aims to help students develop the skills needed to engage meaningfully and responsibly with AI. It now includes explicit instruction on how LLMs work, exposure to current tools, ethical issues, and activities that encourage student reflection on personal use of LLMs as well as the larger evolving landscape of AI-assisted programming. In class, we demonstrate the use and verification of LLM outputs, guide students in the use of LLMs as an ingredient in a larger problem-solving loop, and require students to disclose and acknowledge the nature and extent of LLM assistance. Throughout the course, we discuss risks and benefits of LLMs across CS subfields. In our first iteration of the course, we collected and analyzed data from students' pre- and post-surveys. Student understanding of how LLMs work became more technical, and their verification and use of LLMs shifted to be more discerning and collaborative. These strategies can be used in other courses to prepare students for the AI-integrated future.


Making LLMs Work for Enterprise Data Tasks

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown strong performance on natural language (NL) comprehension tasks, from summarization to question answering. The power of these models comes from optimizing for simple self-supervised learning tasks such as next token prediction using massive public web texts as training data on a scalable and adaptive architecture. However, by construction, LLMs know little about enterprise database tables in the private data ecosystem, which differ substantially from web text in structure and content. Given that LLMs' performance is tied to their training data [1], a crucial question is how useful they can be in improving enterprise database management and analysis tasks. To help contend with this question, we contribute (1) preliminary experimental results on the performance of LLMs for text-to-SQL and semantic column-type detection tasks on enterprise datasets and (2) a discussion of challenges and potential solutions for effectively utilizing LLMs in enterprise settings.


How ChatGPT and Other LLMs Work--and Where They Could Go Next

WIRED

AI-powered chatbots such as ChatGPT and Google Bard are certainly having a moment--the next generation of conversational software tools promises to do everything from taking over our web searches to producing an endless supply of creative literature to remembering all the world's knowledge so we don't have to. ChatGPT, Google Bard, and other bots like them are examples of large language models, or LLMs, and it's worth digging into how they work. Understanding them means you'll be able to make better use of them, and have a better appreciation of what they're good at (and what they really shouldn't be trusted with). Like a lot of artificial intelligence systems--like the ones designed to recognize your voice or generate cat pictures--LLMs are trained on huge amounts of data. The companies behind them have been rather circumspect when it comes to revealing where exactly that data comes from, but there are certain clues we can look at. For example, the research paper introducing the LaMDA (Language Model for Dialogue Applications) model, which Bard is built on, mentions Wikipedia, "public forums," and "code documents from sites related to programming like Q&A sites, tutorials, etc."